Assistant Commissioner of Patents 
United States Patent and Trademark Office 
Washington, D.C. 20231 
ATTN: BOX PATENT APPLICATION 



Sir: 



Transmitted herewith for filing is the patent application of 
Inventor(s): OLIVER, et al 

Entitled: SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

30 No. pages of specification, including title page, claims and abstract 
13 No. sheets of _X_ informal, ___ formal drawings 

enclosed are: 

Executed Combined Declaration and Power of Attorney for Patent Application 
An Original Executed Assignment of the Application 
Form PTO-1595 (Recordation Cover Sheet for Assignment) 
Verified Statement Claiming Small Entity Status with Cover Sheet 
An Information Disclosure Statement (Form PTO-1449A and Form PTO-1449B) 
A copy of References cited in Information Disclosure: documents 

FEES DUE 

The fees due for filing the application pursuant to 37 C.F.R. 1.16 and for recording the Assignment, if any, 
are determined as follows: 
Basic Application Fee ($710 large entity; $355 small entity) 

X $18 / X $9 (small) = 36.00 

Total Independent Claims: 8 Minus 3 = X $80 / X $40 (small) = 200.00 

If Multiple Dependent Claims are presented, add $260.00 or $130.00 (small) 
and is provided as follows: 

The Commissioner is hereby authorized to charge the fees associated with this communication or 
credit any overpayment to Deposit Account No: 500482. A duplicate copy of this authorization is 
enclosed. 

X A Check No. 3533 for the above specified full fee is enclosed. However, in case 
Applicant inadvertently miscalculated any required fee, the Commissioner is hereby authorized to 
charge the necessary additional amount associated with this communication or credit any 
overpayment to Deposit Account No: 500482. A duplicate copy of this authorization is enclosed. 

This application is filed pursuant to 37 C.F.R. 1.53 in the name of the above-identified Inventor(s). 
Please direct all correspondence concerning the above-identified application to the following address: 
(650) 325-4999 
(650) 325-1203 : FAX 

Dennis S. Fernandez 

REG. NO. 34,160 



11/27/2000 



Date 



The Assistant Commissioner of Patents 
United States Patent and Trademark Office 
Washington, D.C. 20231 
ATTN: Box Patent Application 

Re: U.S. Utility Patent Application 

Appl. No. (Not yet assigned); Filed 11/27/2000 

For: SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

Inventor(s): OLIVER, et al 
Docket No.: DYNA-P005 

Sir: 

The following documents are forwarded herewith for action by the U.S. Patent and Trademark Office: 

1. U.S. UTILITY APPLICATION 

entitled: SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

having named inventor(s): 

OLIVER, et al 
a. a specification consisting of: 

(i) 20 pages prior to the claims, including title page; 

(ii) 9 pages of claims; 

(iii) 1 page abstract; 

b. 13 sheets of informal drawings (FIGs. 1A, 1B, 2-9, 10A, 10B, 11-13); 

2. An original, executed Combined Declaration and Power of Attorney by named inventors; 

3. Form PTO-1082 (in duplicate); 

4. Cover letter for Assignment (Form PTO-1595); 

5. An original, executed Assignment to DYNAPTICS CORPORATION, executed by named 
inventors, recordation of which is hereby requested; 

6. A return post card; and 

7. Check No. 3533 for $631.00 to cover: 

Patent application filing fee: $355.00 

Assignment Recordation fee: $40.00 

Excess claims fee: $236.00 

8. Verified Small Entity Status Statement with Cover Sheet 

It is respectfully requested that the attached postcard be stamped with the filing date of the above 
documents and unofficial application number and returned to the addressee as soon as possible. 
EMAIL: iploft@iploft.com 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 
In re application of : 

Application No.: (Not yet assigned) Group No.: (Not yet assigned) 

Filed: 11/27/2000 Examiner: (Not yet assigned) 

Title: SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

Inventor(s): OLIVER, et al 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Submission of verified statement 
to establish small entity status 

The attached statement is being submitted to establish small entity status in this 

_X_ Application 
Patent 



by the 

Independent inventor(s) 37 CFR 1.9(c) and 1.27(b) 

Non-inventor supporting claim by another 37 CFR 1.9(c) and 1.27(b) 

X Small Business Concern 37 CFR 1.9(d) and 1.27(c) 
Nonprofit Organization 37 CFR 1.9(e) and 1.27(d) 

Respectfully submitted, 

Dennis S. Fernandez, 

REG. NO. 34,160 

FERNANDEZ & ASSOCIATES, LLP 

PATENT ATTORNEYS 

PO BOX D 

MENLO PARK, CA 94026-6204 

(650) 325-4999 
(650) 325-1203 : FAX 
EMAIL: iploft@iploft.com 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

Application No.: (Not yet assigned) Patent No.: (Not yet assigned) 

Filed on: 11/27/2000 Issued on: (Not yet assigned) 

Title: SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

Inventor(s) : OLIVER, ET AL 

VERIFIED STATEMENT CLAIMING SMALL ENTITY STATUS 

(37 CFR 1.9(f) and 1.27(c) SMALL BUSINESS CONCERN) 

I hereby declare that I am 

the owner of the small business concern identified below: 



X an official of the small business concern empowered to act on behalf of the concern identified below: 

Name of Small Concern: 

DYNAPTICS CORPORATION 

Address of Small Concern: 

Two North Second Street, Suite 400 
San Jose, CA 95113 

I hereby declare that the above identified small business concern qualifies as a small business concern, as 
provided in 37 CFR 1.9(d), for purpose of paying reduced fees to the United States Patent and Trademark 
Office, in that the number of employees of the concern, including those of its affiliates, does not exceed 500 
persons. 

I hereby declare that the rights under contract or law have been conveyed to, and remain with, the concern 
identified above with regard to the invention described in 

X the specification filed herewith, with title and inventor(s) as listed above. 

the application identified above. 

the patent identified above. 

If the rights held by the above identified concern are not exclusive, each individual, concern, or 
organization having rights in the invention is listed below and no rights to the invention are held by any 
person, other than the inventor, who would not qualify as an independent inventor under 37 CFR 1.9(c), if 
that person made the invention, or by any concern which would not qualify as a small business concern 
under 37 CFR 1.9(d), or a nonprofit organization under 37 CFR 1.9(e). 

Each person, concern or organization having any rights in the invention in addition to the above identified 
concern is listed below: 

Name , 

Address - 

Individual X Small Business Nonprofit Organization 



I acknowledge the duty to file, in this application or patent, notification of any change in status resulting in 
loss of entitlement to small entity status prior to paying, or at the time of paying, the earliest of the issue fee 
or any maintenance fee due after the date on which status as a small business entity is no longer appropriate. 
(37 CFR 1.28(b)) 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made 
on information and belief are believed to be true and further, that these statements were made with 
knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or 
both, under Section 1001 of Title 18 of the United States Code, and that such willful false statements may 
jeopardize the validity of the application, any patent issuing thereon, or any patent to which this verified 
statement is directed. 

George Roumeliotis 

Print Name of Person Signing 

Chief Technology Officer 
Title of Person Signing 

Dynaptics Corporation, Two North Second Street, Suite 400, San Jose, CA 95113 
Address of Person Signing 

SIGNATURE (signed) Date 11/22/2000 



United States Utility Patent Application 

SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 



Inventor(s): 

Jonathan James Oliver, residing at 1123 Meredith Avenue, San Jose, CA 95125, a 
citizen of Australia; 

Wray Lindsay Buntine, residing at 1126 Oxford Street, Berkeley, CA 94707, a 
citizen of the United States of America; and 

George Roumeliotis, residing at 1048 Berkeley Avenue, Menlo Park, CA 94025, a 
citizen of Australia. 
SYSTEM AND METHOD FOR ADAPTIVE TEXT RECOMMENDATION 

BACKGROUND INFORMATION 



Field Of Invention 

The invention relates to a method and system for recommending relevant items to a 
user of an electronic network. More particularly, the present invention relates to a means 
of analyzing the text of documents of interest and recommending a set of documents with 
a high measure of statistical relevancy. 

Description Of Related Art 

Most personalization and web user analysis (also known as "clickstream") 
technologies work with the system making a record of select web pages that a user has 
viewed, typically in a web log. A web log entry records which users looked at which web 
pages in the site. A typical web log entry consists of two major pieces of information, 
namely, first, some form of user identifier such as an IP address, a cookie ID, or a session 
ID, and second, some form of page identifier such as a URL, file name, or product 
number. Additional information may be included such as the page the user came from to 
get to the page and the time when the user requested the page. The web log entry records 
are collected in a file system of a web server and analyzed using software to produce 
charts of page requests per day or most visited pages, etc. Such software typically relies 
on simple aggregations and summarizations of page requests rather than any analysis of 
the internal page structure and content. 

Other personalization software also relies on the concept of web logs. The 
dominant technology is collaborative filtering, which works by observing the pages of the 
web site a user requests, searching for other users that have made similar requests, and 
suggesting pages that these other users requested. For example, if a user requests pages 1 
and 2, a collaborative filtering system would find others who did the same. If the other 
users on the average also requested pages 3 and 4, a collaborative system would offer 
pages 3 and 4 as a best recommendation. Other collaborative filtering systems use 
statistical techniques to perform frequency analysis and more sophisticated prediction 
techniques using methods such as neural networks. Examples of collaborative filtering 
systems include NetPerceptions, LikeMinds, and WiseWire. Such a system in action can 
be viewed at Amazon.com. 
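
The page 1 and 2 example above can be sketched in a few lines of Python. This is only an illustration of the general idea of collaborative filtering over a web log, not the method of any of the named products; the function name `collaborative_recommend` and the log layout (a mapping from user identifier to requested pages) are assumptions made for the sketch.

```python
from collections import Counter

def collaborative_recommend(user_pages, logs, top_n=2):
    """Suggest pages requested by other users who requested the same pages.

    `logs` is a hypothetical web log summary: user identifier -> pages requested.
    """
    user_pages = set(user_pages)
    votes = Counter()
    for pages in logs.values():
        pages = set(pages)
        # Only users who requested everything this user requested count as "similar".
        if user_pages <= pages:
            votes.update(pages - user_pages)
    # Most frequently co-requested pages first; ties broken by page identifier.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [page for page, _ in ranked[:top_n]]

logs = {"u1": [1, 2, 3, 4], "u2": [1, 2, 3], "u3": [1, 2, 4], "u4": [5, 6]}
print(collaborative_recommend([1, 2], logs))
```

Note that a page nobody has requested yet can never receive a vote, which is the limitation the following paragraphs describe for auctions and other one-off content.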

Other types of collaborative filtering systems allow users to rank their interest in a 
group of documents. User answers are collected to develop a user profile that is 
compared to other user profiles. Documents viewed by others with the same profile are 
recommended to the user. This approach may use artificial intelligence techniques such 
as incremental learning methods to improve the recommendations based on user 
feedback. Systems using this approach include SiteHelper, Syskill & Webert, Fab, Libra, 
and WebWatcher. However, collaborative filtering is ineffective for personalizing 
documents with dynamic or unstructured content. For example, each auction in an 
auction web site or item offered in a swap web site is different and may have no logged 
history of previous users to which collaborative filtering can be applied. Collaborative 
filtering is also not effective for infrequently viewed documents or offerings of interest to 
only a few site visitors. 

Clearly, there is a need for a system that considers not only the identifiers of the 
pages the user viewed but also the words in the pages viewed in order to make more 
focused recommendations to the user. Broadening the concept of pages to documents in 
general, there is a need for a recommendation system that analyzes the words in the 
document a user has expressed interest in. Such a recommendation system should 
support options of residing in the same computer as the web site, or on a remote server, or 
on an end user's computer. Furthermore, the system should be able to access documents 
from external sources such as from other web sites throughout the Internet or from private 
networks. A flexible recommendation system should also support a scalable architecture 
that uses a proprietary text search engine or leverages the search engines of other web 
sites or generalized Internet-wide search engines. 

SUMMARY OF INVENTION 
The invention discloses methods and systems for adaptively selecting relevant 
documents to present to a requestor. A requestor device, either a client working on a PC 
or a software program running on a server, automatically or manually invokes the 
adaptive text recommendation system (ATRS), and based on keywords extracted from the 
text of related documents, a set of relevant documents is presented to the requestor. The 
set of recommended documents is continually updated as more documents are added to 
the set of related documents, or interest set. ATRS adapts the choice of recommended 
documents based on new analysis of text contained in the interest set, categorizing the 
documents into clusters, extracting the keywords that capture the theme or concept of the 
documents in each cluster, and filtering the entire set of eligible documents in the 
application web site and/or other web sites to compile the set of recommended documents 
with a high measure of statistical relevancy. 

One embodiment is an application of ATRS in an e-commerce site, such as a 
seller of goods or services or an auction web site. A client logging onto an e-commerce 
site is greeted with a recommended set of relevant goods, services, or auction items by 
analyzing the text of the documents representing items previously bought, ordered, or bid 
on. As the client selects an item from the recommended set or an item on the web page, 
ATRS updates the documents in the interest set, categorizes the documents in the interest 
set into clusters, extracts keywords from the clusters, and filters the eligible set of 
documents at the web site to construct a recommended set. This recommended set of 
documents may be rebuilt every time the client makes a new selection or moves to a 
different web page. 

The recommended set of documents may be presented as a panel or HTML 
fragment in a web page being viewed. The recommendations may be ordered, for 
example, by the statistical measure of relevancy or by popularity of the item, and filtered 
based on information about the client. 

In an alternate embodiment, ATRS may be invoked automatically by a software 
program to develop a recommended set for existing clients not currently logged on. The 
recommendations may take the form of a notification to select clients of sales, special 
events, or promotions. In other alternate embodiments, the recommendations may take 
the form of a client alert or "push" technology data feed. Similarly, other applications of 
ATRS include notification of clients of upcoming television shows, entertainment, or job 
postings based on the analysis of the text of documents associated with the shows, 
entertainment or job openings in which the client has indicated previous interest. 

Additional applications of ATRS include automatic classification of personal 
e-mail, and automatic routing of customer relations e-mail to representatives who 
previously successfully resolved similar types of e-mail. The recommended set may also 
consist of Internet bookmarks or subscriptions to publications for a "community of 
interest" group. Furthermore, the recommended set may be transmitted as a fax, 
converted to audio, video, or an alert on a pager or PDA and transmitted to the requestor. 

The present invention can be applied to data in general, wherein a requestor 
device issues a request for recommended data comprising documents, audio files, video 
files or multimedia files and an adaptive data recommendation system would return a 
recommended set of such data. 

BRIEF DESCRIPTION OF DRAWINGS 
FIGS. 1A-1B are an architectural diagram and flow diagram, respectively, 
illustrating an adaptive text recommendation system invoked by a requestor device, in 
one embodiment of the present invention. 

FIG. 2 is an architectural diagram of the main components or modules of an 
adaptive text recommendation system in one embodiment of the present invention. 

FIG. 3 is a flow diagram of the main components or modules of an adaptive text 
recommendation system in one embodiment of the present invention. 

FIG. 4 is a flow diagram of the assembly processing of ATRS in one embodiment 
of the present invention. 

FIG. 5 is an architectural diagram of the pre-processing of the interest set of 
ATRS in one embodiment of the present invention. 

FIG. 6 is a flow diagram of the pre-processing of ATRS in one embodiment of the 
present invention. 

FIG. 7 is an architectural diagram of the clustering process of ATRS in one 
embodiment of the present invention. 

FIG. 8 is a flow diagram of the keyword extraction process of ATRS in one 
embodiment of the present invention. 

FIG. 9 is a flow diagram of the recommendation processing of ATRS in one 
embodiment of the present invention. 

FIG. 10A is an architectural diagram of ATRS operable in the application website, 
whereas FIG. 10B is an architectural diagram of ATRS operable in a distributed manner 
with segments running at the application website and at a remote site, according to one 
embodiment of the present invention. 

FIG. 11 is an architectural diagram illustrating the deployment of multiple 
applications of ATRS in and outside the United States, according to one embodiment of 
the present invention. 

FIG. 12 is an architectural diagram of an adaptive data recommendation system in 
an alternative embodiment of the present invention, illustrating the data requestor device 
invoking and receiving a set of recommended relevant data. 

FIG. 13 is an architectural diagram illustrating the major input and output of an 
adaptive data recommendation system in an alternative embodiment of the present 
invention, illustrating the various types of data that are requested and returned to the 
requestor device. 

DETAILED DESCRIPTION OF PREFERRED EMBODIMENT(S) 

FIG. 1A shows how the requestor device 2 invokes, either manually or 
automatically, a request for a set of relevant documents to ATRS 4, which processes the 
request, obtains a set of relevant documents from a document source 6, and returns the 
set to requestor device 2. FIG. 1B is a high level flow diagram of ATRS consisting of 
steps where ATRS is invoked manually or automatically by a requestor for a set of 
relevant documents 105 and ATRS returns a set of relevant documents 107. A requestor 
may be a client or a software program. A requestor device may be a client personal 
computer. 

FIG. 2 shows the major modules of one embodiment of the present invention. 
The major modules are: Assembly Module 10, Pre-processing Module 30, Clustering 
Module 40, Keyword Extraction Module 50, Filtration Module 60, Recommendation 
Module 80, and Presentation Module 90. 

The Assembly Module 10 assembles documents from multiple sources into an 
interest set. Documents in the interest set may include documents in a database 
considered of interest to the requestor, web site pages previously viewed by the requestor 
in the application web site or other web sites, documents selected by the requestor from a 
list obtained by a search in the application web site or by an Internet-wide search, e-mail 
sent by the requestor, documents transmitted from a remote source such as those 
maintained in remote servers or in other private network databases, and documents sent 
by fax, scanned or input into any type of computer and made available to the Assembly 
Module 10. For example, in an auction site, the client, presented with a list of live 
auction items, clicks on several auction items that are of interest, then invokes ATRS to 
show a set of recommended auction items. 

The Pre-processing Module 30 isolates the words in the interest set and removes 
words that are not useful for distinguishing one document from another document. 
Words removed are common words in the language and words that are non-significant to a 
specific application of ATRS. 

The Clustering Module 40 groups the documents whose words have a high degree 
of similarity into clusters. 

The Keyword Extraction Module 50 determines the keyword score for each word 
in a cluster and selects as keywords for the cluster the words that have the highest keyword 
scores and that also appear in a minimum number of documents specified for the application. 

The Filtration Module 60 uses application parameters for assembling documents 
considered eligible for recommendation. Eligible documents may include documents 
from enterprise databases, documents from private network databases, documents from 
the application web site, and documents from public networks, such as the Internet. 
Furthermore, these documents may cover subjects in many fields including but not 
limited to finance, law, medicine, business, environment, education, science, and venture 
capital. Application parameters may include age of documents and/or client data that 
specify inclusion or exclusion of certain documents. 
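
As a rough illustration of how application parameters might narrow the eligible set, the sketch below filters candidate documents by age and by an exclusion list. The record fields (`id`, `created`, `source`) and the parameter names `max_age_days` and `excluded_sources` are hypothetical; the text does not define a document schema.

```python
from datetime import date

def filter_eligible(documents, today, max_age_days=None, excluded_sources=()):
    """Keep only documents that satisfy the (hypothetical) application parameters."""
    eligible = []
    for doc in documents:
        # Age parameter: drop documents older than the allowed number of days.
        if max_age_days is not None and (today - doc["created"]).days > max_age_days:
            continue
        # Exclusion parameter: drop documents from sources ruled out for this client.
        if doc["source"] in excluded_sources:
            continue
        eligible.append(doc)
    return eligible

docs = [
    {"id": "a", "created": date(2000, 11, 25), "source": "site"},
    {"id": "b", "created": date(2000, 11, 1), "source": "site"},
    {"id": "c", "created": date(2000, 11, 26), "source": "partner"},
]
kept = filter_eligible(docs, today=date(2000, 11, 27), max_age_days=7,
                       excluded_sources={"partner"})
print([d["id"] for d in kept])
```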

The Recommendation Module 80 calculates the relevance score for eligible 
documents to a cluster and ranks the eligible documents by relevance score and other 
application criteria. Top scoring documents are further filtered by criteria specific to the 
client. 

The Presentation Module 90 personalizes the presentation format of the 
recommendations for the client. Examples of formats are e-mail, greetings to a site 
visitor, HTML fragment or a list of Internet sites. Any special sorting or additional 
filtration for the client is applied. The recommendations are converted to the desired 
medium, such as voicemail, fax hardcopy, file transfer transmission, or audio/video alert. 

FIG. 3 is a flow chart of one embodiment of the present invention starting with the 
assembly of documents from multiple sources into an interest set 110; pre-processing of 
the documents to remove "stop" words 112; grouping the documents in the interest set 
into clusters 114; extraction of keywords contained in documents included in the clusters 
116; filtration of documents eligible to be considered for recommendation for each cluster 
118; construction of a recommendation set of documents per cluster 120; and presentation 
of the recommendations 122. 

FIG. 4 is a flow chart of the Assembly Module 10 illustrating the process 
involved in assembling all documents which comprise the interest set. Documents 
previously recorded for the client 130 may include previous purchases in an e-commerce 
site, bids in an auction site, or web pages visited by client which contain tags that 
automatically trigger communication to a server of the page or data involved. Documents 
may include those corresponding to the navigation path of the client in the website 132. 
The client may have selected documents from a list of web pages 134 as a result of a site 
search or an Internet-wide search. Other documents may include e-mails, faxed 
documents, scanned documents or any other form of document input associated with the 
client 136. Alternatively, documents included may be those transmitted through a 
network for the client 138 where the storage of documents is done remotely. All input 
documents are assembled into an interest set 140. 

FIG. 5 is an architectural chart illustrating the use of the assembled interest set 26 
and the Stop Word Database 32 in the Pre-processing Module 30 to create the refined 
interest set of documents 34. The Stop Word Database 32 comprises words that are not 
useful for distinguishing one document from another document in the interest set. If the 
application language is English, examples would include words such as 'and', 'the', and 
'etc.' The Stop Word Database 32 also includes words that are common in the interest set 
as a result of the purpose, application or business conducted for the site. For example, on 
an auction site, each web page containing an item description might also contain the 
notice "Pay with your Visa card!" In this case, the words 'pay', 'visa', and 'card' would 
be included in the Stop Word Database 32. 

FIG. 6 is a flow chart illustrating the process performed in the Pre-processing 
Module 30 in one embodiment. The process includes isolating words in the documents of 
the interest set and converting the words into a common format 150, such as converting 
the words to lower case. A word is an alphanumeric string surrounded by white space or 
punctuation marks. Next, if a word is a common word of the language 152, the word is 
removed 158. If a word is a non-significant word specific to the site and the application 
154, it is also removed 158. Otherwise, the word is retained in the document 156. In one 
embodiment, the common words of the language and the non-significant words specific 
to the application are maintained in the Stop Word Database 32. 
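
The pre-processing steps just described can be sketched as follows. The `STOP_WORDS` set is a hypothetical stand-in for the Stop Word Database 32, seeded with a few common English words and the site-specific words from the "Pay with your Visa card!" example; the function name `preprocess` is likewise an assumption.

```python
import re

# Hypothetical stand-in for the Stop Word Database 32: common words of the
# language plus words that are non-significant for this particular site.
STOP_WORDS = {"and", "the", "etc", "with", "your", "a", "of", "pay", "visa", "card"}

def preprocess(text, stop_words=STOP_WORDS):
    """Isolate words, convert them to lower case, and remove stop words."""
    # A word is an alphanumeric string bounded by white space or punctuation.
    words = re.findall(r"[A-Za-z0-9]+", text.lower())
    return [w for w in words if w not in stop_words]

print(preprocess("Antique oak desk and chair. Pay with your Visa card!"))
```

Keeping repeated words in the returned list, rather than collapsing them into a set, preserves the per-document word counts that the similarity calculation relies on.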

FIG. 7 is an architectural chart illustrating the use of the refined interest set 34 and 
processing in the Clustering Module 40 to group the documents into clusters 42, 44, and 
46. Clustering is the process of grouping together documents in the interest set whose 
words have a high degree of similarity. In one embodiment of the present invention, the 
similarity of two documents D1 and D2 is denoted by similarity(D1, D2). If D1 does not 
contain any words in common with D2, then: 

similarity(D1, D2) = 0. 

If the two documents have words in common, then: 

$$\mathrm{similarity}(D_1, D_2) = \frac{\sum_{w \in D_1 \cap D_2} \mathrm{count}(w, D_1)\,\mathrm{count}(w, D_2)}{\left[\sum_{w} \mathrm{count}(w, D_1)^2\right]^{1/2} \left[\sum_{w} \mathrm{count}(w, D_2)^2\right]^{1/2}}$$

where count(w, D) denotes the number of occurrences of the word w in the document D, 
and w ∈ D1 ∩ D2 denotes a word that appears in both D1 and D2. Many other definitions 
of similarity between two documents are possible. 
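
A minimal sketch of this similarity measure follows, assuming documents are represented as lists of pre-processed words and assuming the denominator is the product of the square roots of each document's summed squared word counts (the familiar cosine form). The function name and the list-of-words representation are illustrative choices, not taken from the text.

```python
import math
from collections import Counter

def similarity(doc1, doc2):
    """Cosine-style similarity between two documents given as lists of words."""
    c1, c2 = Counter(doc1), Counter(doc2)
    shared = set(c1) & set(c2)
    if not shared:
        # No words in common: similarity is defined to be zero.
        return 0.0
    # Numerator: sum over shared words of count(w, D1) * count(w, D2).
    dot = sum(c1[w] * c2[w] for w in shared)
    # Denominator (assumed form): product of the two count-vector lengths.
    norm1 = math.sqrt(sum(n * n for n in c1.values()))
    norm2 = math.sqrt(sum(n * n for n in c2.values()))
    return dot / (norm1 * norm2)

print(similarity(["oak", "desk"], ["oak", "chair"]))
print(similarity(["oak", "desk"], ["silver", "ring"]))
```

Under this form the score is 0 for documents with no shared words and approaches 1 for documents with identical word distributions.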

The clustering criteria may vary depending on the application of ATRS 4. An 
advantageous implementation involves arranging the documents from the interest set so 
as to maximize the cluster score, wherein the cluster score of a cluster containing only 
one document is zero and the cluster score for a cluster containing more than one 
document is the average similarity score between the documents in the cluster. 
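
The cluster score can be sketched as below. The sketch assumes that "average similarity score between the documents" means the mean over all unordered pairs of documents in the cluster, and it reuses a cosine-style similarity; both points are assumptions, as are the function names.

```python
import math
from collections import Counter
from itertools import combinations

def similarity(doc1, doc2):
    """Cosine-style similarity between two documents given as lists of words."""
    c1, c2 = Counter(doc1), Counter(doc2)
    dot = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    if dot == 0:
        return 0.0
    norm1 = math.sqrt(sum(n * n for n in c1.values()))
    norm2 = math.sqrt(sum(n * n for n in c2.values()))
    return dot / (norm1 * norm2)

def cluster_score(cluster):
    """Zero for a single-document cluster, else the mean pairwise similarity."""
    if len(cluster) < 2:
        return 0.0
    pairs = list(combinations(cluster, 2))
    return sum(similarity(a, b) for a, b in pairs) / len(pairs)

print(cluster_score([["oak", "desk"]]))
print(cluster_score([["oak", "desk"], ["oak", "chair"], ["oak", "table"]]))
```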

The clustering algorithm can be any one of well-known clustering algorithms that 
can be applied to maximize the clustering criterion, such as K-Means, Single-Pass, or 
Buckshot, which are incorporated by reference. 
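
For illustration, a simplified single-pass clustering sketch is given below. It is a greedy heuristic in the spirit of Single-Pass clustering and does not guarantee a maximal cluster score; the `threshold` parameter, its default value, and the rule of comparing a document against the average similarity to a cluster's members are all assumptions made for the sketch.

```python
import math
from collections import Counter

def similarity(doc1, doc2):
    """Cosine-style similarity between two documents given as lists of words."""
    c1, c2 = Counter(doc1), Counter(doc2)
    dot = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    if dot == 0:
        return 0.0
    norm1 = math.sqrt(sum(n * n for n in c1.values()))
    norm2 = math.sqrt(sum(n * n for n in c2.values()))
    return dot / (norm1 * norm2)

def single_pass(docs, threshold=0.3):
    """Assign each document to its most similar cluster, or start a new one."""
    clusters = []
    for doc in docs:
        best, best_sim = None, threshold
        for cluster in clusters:
            # Average similarity of the new document to the cluster's members.
            avg = sum(similarity(doc, member) for member in cluster) / len(cluster)
            if avg >= best_sim:
                best, best_sim = cluster, avg
        if best is None:
            clusters.append([doc])  # nothing similar enough: open a new cluster
        else:
            best.append(doc)
    return clusters

docs = [["oak", "desk"], ["oak", "chair"], ["silver", "ring"], ["gold", "ring"]]
print(single_pass(docs))
```

Because documents are visited once, in order, the result can depend on the order in which the interest set is presented; K-Means or Buckshot would trade that simplicity for additional passes over the data.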

FIG. 8 is a flow diagram of the keyword extraction processing of ATRS 4 in one 
embodiment of the present invention. For each word w in a cluster C, calculate the 
frequency of the word w in the interest set, Frequency(w); and calculate the frequency of 
the word w in cluster C, Frequency(w, C) 180. Calculate the keyword score for word w 

in the cluster C 182, using the equation: 

Keyword score(w, C) = log Frequency(w, C) - log Frequency(w). 






Select keywords for cluster C based on application criteria 184; for example, select 
keywords that have high scores and appear in several documents. Upon processing all 
clusters 186, the system proceeds to the balance of processing. In an alternative 
embodiment of the present invention, the keywords describing the theme or concept in a 
cluster do not necessarily appear in the text of any document, but instead summarize the 
theme or concept determined, for example, by a method for natural language 
understanding. 
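
The keyword score and selection step might be sketched as follows. The sketch assumes that Frequency(w, C) and Frequency(w) are relative frequencies (occurrences of w divided by the total number of words in the cluster and in the interest set, respectively) and that the cluster's documents are part of the interest set; the parameters `min_docs` and `top_n` are hypothetical application criteria.

```python
import math
from collections import Counter

def keyword_scores(cluster, interest_set):
    """log Frequency(w, C) - log Frequency(w) for every word w in cluster C."""
    cluster_counts = Counter(w for doc in cluster for w in doc)
    all_counts = Counter(w for doc in interest_set for w in doc)
    cluster_total = sum(cluster_counts.values())
    all_total = sum(all_counts.values())
    return {
        w: math.log(n / cluster_total) - math.log(all_counts[w] / all_total)
        for w, n in cluster_counts.items()
    }

def select_keywords(cluster, interest_set, min_docs=2, top_n=5):
    """Pick high-scoring words that appear in at least `min_docs` cluster documents."""
    scores = keyword_scores(cluster, interest_set)
    doc_freq = Counter(w for doc in cluster for w in set(doc))
    eligible = [w for w in scores if doc_freq[w] >= min_docs]
    return sorted(eligible, key=lambda w: (-scores[w], w))[:top_n]

furniture = [["oak", "desk"], ["oak", "chair"]]
jewelry = [["silver", "ring"], ["gold", "ring"]]
print(select_keywords(furniture, furniture + jewelry))
```

With relative frequencies, a word that is concentrated in one cluster scores above zero, while a word spread evenly across the whole interest set scores near zero.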

FIG. 9 is a flow diagram of the recommendation processing of ATRS 4 in one 
embodiment of the present invention. For each eligible document D, count the number of 
times each keyword w ∈ keywords(C) appears 190. Calculate the relevance score of 
document D to cluster C using the equation: 

$$\mathrm{relevance}(D, C) = \frac{\sum_{w \in \mathrm{keywords}(C)} \mathrm{count}(w, D)}{\sum_{w} \mathrm{count}(w, D)^2}$$

where w ∈ keywords(C) denotes one of the keywords of cluster C. 
Rank eligible documents by relevance score and other application criteria 194. Retain top 
scoring documents and apply other filtration criteria specific to this client 196. For 
example, the client may only want documents created within the last seven days. 
At the completion of all clusters 198, the system proceeds to the balance of processing. 
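
The ranking step can be sketched as below. The normalization used here, dividing the summed keyword counts by the sum of squared word counts of the document, is one reading of the relevance equation and should be treated as an assumption; the function names, the `top_n` parameter, and the dictionary of eligible documents are hypothetical.

```python
from collections import Counter

def relevance(doc, keywords):
    """Summed keyword counts, normalized by the sum of squared word counts (assumed form)."""
    counts = Counter(doc)
    denom = sum(n * n for n in counts.values())
    if denom == 0:
        return 0.0
    return sum(counts[w] for w in keywords) / denom

def recommend(eligible, keywords, top_n=3):
    """Rank eligible documents by relevance to a cluster's keywords."""
    scored = [(relevance(words, keywords), name) for name, words in eligible.items()]
    # Drop documents that contain none of the keywords.
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]

eligible = {
    "oak table": ["oak", "table"],
    "oak bookshelf": ["oak", "bookshelf", "five", "shelves"],
    "gold ring": ["gold", "ring"],
}
print(recommend(eligible, {"oak"}))
```

Client-specific criteria, such as keeping only documents created within the last seven days, would be applied to the ranked list afterwards.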

The presentation of recommendations may be through a set ordered by relevance 
score; a set ordered by popularity of document; a greeting to a site visitor; a notification of a 
sale, event, or promotion; a client alert, for example, a sound indicating presence of a new 
document or a new article obtained from a newswire as in "push" data feed delivery 
methods; or notification of TV shows and entertainment based on processing the 
descriptions of previously viewed TV programs or purchased tickets for entertainment 
shows. Hard copy formats in the form of postcards, letters, or fliers may also be the 
medium of presentation. 

Another embodiment of the present invention is conversion of the 
recommendation set of documents into files for faxing to the client, conversion to voice 
and presenting it as a voicemail, a pager or audio or video alert for the client. 
Advantageously, such recommendations can be sent through a network and stored for 
later retrieval. In another embodiment, the system may serve a "community of interest" 
like a wine connoisseur's Internet list or chat room where the recommendation may 
consist of the popular magazines or web pages viewed by experts of the community of 
interest. Alternatively, the recommendation may be presented to the client or requestor as 
a set of Internet bookmarks. 

There are several alternative embodiments of the present invention. In a 
document classification application, customer e-mails sent to a company's customer 
service representative (CSR) department can be routed to the CSR who successfully 
resolved similar e-mails containing the same issues. A similar application is the 
automatic classification of personal e-mail, wherein ATRS processes e-mails read and/or 
responded to by the client, applying the clustering, keyword extraction, filtering, and 
recommending steps to present the recommended e-mails to the client and treating the rest 
as miscellaneous. The client may further specify presentation of the top ten e-mails only, 
a very useful feature for e-mail access on wireless devices. Other classification 
applications are automatic routing of job postings to a job category, and automatic 
classification of classified advertisements, offers for sale, or offers to swap items or 
services. 
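
The CSR routing application can be illustrated with a small sketch. It assumes that routing
is decided by simple word overlap between the incoming e-mail and the e-mails each CSR has
previously resolved; this overlap measure is a stand-in chosen for brevity and is not the
clustering and keyword scoring described for ATRS. The names STOP_WORDS, tokenize, and
route_email, and the stop-word list itself, are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real application would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "my", "to", "and", "of", "i", "for", "on", "it"}


def tokenize(text):
    """Lower-case the text and drop common words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]


def route_email(email_text, resolved_by_csr):
    """Return the CSR whose resolved e-mails share the most word occurrences
    with the new e-mail, or None if nothing overlaps."""
    incoming = Counter(tokenize(email_text))
    best_csr, best_overlap = None, 0
    for csr, emails in resolved_by_csr.items():
        history = Counter()
        for text in emails:
            history.update(tokenize(text))
        # Counter intersection keeps the minimum count of each shared word.
        overlap = sum((incoming & history).values())
        if overlap > best_overlap:
            best_csr, best_overlap = csr, overlap
    return best_csr
```

An e-mail with no overlap is left unrouted, which corresponds loosely to treating
unmatched items as miscellaneous.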

Other applications of ATRS involve research either on the Internet or in enterprise 
databases. For example, a client may be interested in "banking". Instead of sifting 
through multitudes of documents that contain "banking", the client may "mark" several 
documents and invoke ATRS to present a set of recommended documents with a high 
measure of statistical relevance. This research may be invoked on a periodic basis, 
wherein ATRS presents the recommended set of documents to the client in the form of a 
notification or to clients in the "community of interest" application. 

In another application of ATRS, online auction participants who have lost an 
auction are sent e-mail or other notification containing a list of auctions that are similar to 
the one they lost. This list is generated based on textual analysis of the description of the 
lost auction. 
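
A minimal sketch of the lost-auction application follows. It assumes that the textual
analysis is approximated by cosine similarity between word-count vectors of the auction
descriptions; cosine similarity is substituted here for illustration and is not stated to be
the measure used by ATRS. The names term_vector, cosine, and similar_auctions are
hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Word-count vector of a description."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity of two word-count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def similar_auctions(lost_description, open_auctions, limit=3):
    """Return up to `limit` auction ids, most similar to the lost auction first."""
    lost = term_vector(lost_description)
    scored = [(cosine(lost, term_vector(desc)), auction_id)
              for auction_id, desc in open_auctions.items()]
    scored = [(s, a) for s, a in scored if s > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [a for _, a in scored[:limit]]
```

The resulting list of auction identifiers is what would be placed in the e-mail or other
notification sent to the participant.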

Another application of ATRS involves analyzing the text of news stories or other 
content being viewed by a site visitor and displaying a list of products whose descriptions 
contain similar themes or concepts. For example, a visitor to a web site featuring stories 
about pop stars might read an article about Madonna and be presented a list of Madonna- 
related products such as musical recordings, clothing, etc. The presentation of the 
recommended products might be done immediately as the site visitor is browsing, or upon 
returning to the web site, or in an e-mail, or other delayed form of notification. 

Similarly, ATRS can work in conjunction with a regular search engine to narrow 
the results to a more precise recommended set of documents. In one embodiment, ATRS 
4 is a front-end system of a network search engine. ATRS 4 analyzes the text of an 
interest set of documents, groups the interest set of documents into clusters, extracts 
keywords from the text of the documents grouped into the clusters, and communicates the 
selected keywords of the clusters to the search engine. The search engine uses these 
keywords to search the network for documents that match the keywords and other 
filtering criteria that may be set up for the application. 
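
The front-end arrangement can be sketched as follows, assuming that the interest set has
already been grouped into clusters and that keyword selection is approximated by word
frequency within each cluster. Frequency is only a placeholder for the keyword score
described earlier, and the search engine is represented by any callable that accepts a list
of keywords. The names extract_keywords, search_by_cluster, and search_engine are
hypothetical.

```python
import re
from collections import Counter

# Illustrative stop-word list used during pre-processing.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on", "is"}


def extract_keywords(cluster_docs, k=3):
    """Pick the k most frequent non-common words in a cluster
    (frequency stands in for the keyword score)."""
    counts = Counter()
    for text in cluster_docs:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:k]]


def search_by_cluster(clusters, search_engine, k=3):
    """Communicate each cluster's keywords to the search engine and
    collect the results per cluster."""
    results = {}
    for name, docs in clusters.items():
        keywords = extract_keywords(docs, k)
        results[name] = search_engine(keywords)
    return results
```

Passing the search engine in as a callable keeps the sketch independent of any particular
engine; additional filtering criteria could be applied to each result list before
presentation.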

FIG. 10A is an architectural diagram in which the requestor device 2 may be a PC 
used by a client to access a website, and ATRS 4 is manually or automatically invoked 
upon accessing the site. The document source 6 may be at the website or may be the 
entire Internet. FIG. 10B shows an alternative embodiment of the present invention 
wherein the requestor device 2 is essentially unchanged but the application website 300 
for ATRS 4 hosts only the ATRS shell 300 or application proxy, and the ATRS modules 
305 are operable at a remote site. Document source 6 may be operable in a distributed 
manner at the same remote site as the ATRS modules 305 or at a different one. 
Alternatively, document source 6 may be the entire Internet. 

FIG. 11 is an architectural diagram illustrating the deployment of multiple 
applications of ATRS 4 in and outside the United States, according to the present 
invention. Requestor device 1 310 is in the United States, and Requestor device 2 312 is 
located outside of the United States. Requestor device 1 310 and Requestor device 2 312 
are coupled to ATRS 1 314 in the United States and/or ATRS 2 316 located outside of the 
United States. Document Source 1 318 is in the United States whereas Document Source 
2 320 is outside the United States, and both are coupled to and provide eligible documents 
for ATRS 1 314 and/or ATRS 2 316. 

FIG. 12 is an architectural diagram of an adaptive data recommendation system in 
an alternative embodiment of the present invention, illustrating the data requestor device 
330 invoking and receiving a set of recommended relevant data from an adaptive data 
recommendation system 332 using data source 334. 

FIG. 13 is an architectural diagram illustrating the major input and output of an 
adaptive data recommendation system in an alternative embodiment of the present 
invention, illustrating the various types of data that are requested and returned to the 
requestor device. A document interest set 340, an audio interest set 342, a video interest 
set 344, and/or a multimedia interest set 346 are accessed by an adaptive data 
recommendation system 332, utilizing a data source 334, a client database 348, and 
application parameters 358 to create a recommended data set comprising document 
recommended set 350, audio recommended set 352, video recommended set 354, and 
multimedia recommended set 356. As an example, based on the description of various 
artists and their singing styles, a requestor device may specify certain singers with the 
type of songs and lyrics desired. An adaptive data recommendation system would then 
cluster the songs and artists, extract keywords of the lyrics or key notes or note patterns 
in the artists' songs, search sites containing libraries of artists and songs, and select for 
recommendation the downloadable songs relevant to the requestor's criteria. The 
recommendation could be streaming audio or streaming video that can be played at the 
requestor device. 

One implementation of the present invention is on a Linux OS running an Apache 
web server with a MySQL database. However, a person knowledgeable in the art will 
readily recognize that the present invention can be implemented on different operating 
systems and different web servers, with other types of databases including but not 
limited to Oracle and Informix. 

A person knowledgeable in the art will readily recognize that the present 
invention can be implemented in a portable device comprising a controller; memory; 
storage; input accessories such as a keyboard, pressure-sensitive pad, or voice recognition 
equipment; a display for presenting the recommended set; and communications 
equipment to wirelessly connect the portable device to an information network. In one 
embodiment, the ATRS computer readable code can be loaded into the portable device by 
disk, tape, or a hardware plug-in, or downloaded from a site. In another embodiment, the 
logic and principles of the present invention can be designed and implemented in the 
circuitry of the portable device. 

The foregoing described embodiments of the invention are provided as illustrations 
and descriptions. They are not intended to limit the invention to the precise form described. 
In particular, it is contemplated that the functional implementation of the invention 
described herein may be implemented equivalently in hardware, software, firmware, and/or 
other available functional components or building blocks. 
Other variations and embodiments are possible in light of the above teachings, and it is 
thus intended that the scope of the invention not be limited by this Detailed Description, 
but rather by the Claims following. 

CLAIMS 

What is claimed is: 

1. A method for adaptive text recommendation, the method comprising: 

receiving a query; and 

adaptively changing the query result in response to the query. 

2. The adaptive text recommendation method of Claim 1 wherein the changing 
step comprises clustering of an interest set of documents into one or more clusters; 
extracting keywords for the one or more clusters that represent the theme of the 
documents in the one or more clusters; filtering of an eligible set of documents to meet 
application criteria; and adaptively constructing a recommended set of documents for 
each cluster of the one or more clusters. 

3. The adaptive text recommendation method of Claim 2 wherein the clustering 
step further comprises assembling the interest set of documents; pre-processing words of 
the interest set of documents; and grouping of documents from the interest set of 
documents into the clusters utilizing a clustering algorithm that maximizes the cluster 
score of the clusters. 

4. The adaptive text recommendation method of Claim 3 wherein the assembling 
step comprises collecting documents previously viewed by a client; collecting e-mails 
that elicited a response from the client; collecting documents describing items previously 
bought by the client; collecting documents describing items the client made a bid on; 
collecting documents associated with selections from a list of documents, the selections 
being made by the client; collecting pages of web sites wherein the client indicated 
interest; collecting documents recorded for the client; and collecting documents 
associated with a client transmitted from a remote source. 

5. The adaptive text recommendation method of Claim 3 wherein the pre- 
processing step comprises removing common words in the language used in the 
application; and removing words which are not significant for the application. 

6. The adaptive text recommendation method of Claim 2 wherein the extracting 
keywords step utilizes a process that calculates the keyword score of the cluster and selects 
keywords that maximize the keyword score of the cluster. 

7. The adaptive text recommendation method of Claim 2 wherein the eligible set 
of documents comprises documents from an application web site; documents from other 
web sites; documents from private databases; and documents selected from the Internet 
using a search process. 

8. The adaptive text recommendation method of Claim 2 wherein the 
construction of the recommended set of documents further comprises calculating a 
relevance score of each document in the eligible set of documents; selecting documents of 
the eligible set of documents with high relevance scores; and applying other selection 
criteria comprising popularity of the document in the eligible set of documents and client 
preference for the document in the eligible set of documents. 

9. The adaptive text recommendation method of Claim 2 further comprising 
presenting the recommended set of documents using a presentation technique that 
comprises sending an e-mail, displaying a greeting, displaying an HTML fragment, 
sending a fax, sending a voicemail, sending a video alert, sending an audio alert, and 
transmitting a file representing the recommended set of documents. 

10. The adaptive text recommendation method of Claim 1 wherein the received 
query comprises a request from a requestor device enabled by an action of the client and a 
software request. 

11. The adaptive text recommendation method of Claim 10 wherein the action of 
the client enabling the query request comprises logging onto a web site that automatically 
generates the query; manually requesting the query; and making a selection at the web site 
that generates the query. 

12. A method of adaptive offer recommendation, the method comprising: 
receiving a query for an offer; and 
adaptively changing the recommended set of offers in response to the query. 

13. The adaptive offer recommendation method of Claim 12 wherein the 
changing step comprises clustering of an interest set of offer descriptions into clusters; 
extracting keywords for the clusters, the keywords representing the theme of the offer 
descriptions of the clusters; and filtering of an eligible set of offer descriptions to meet an 
application criteria; and adaptively constructing a recommended set of offer descriptions 
for the clusters. 

14. The adaptive offer recommendation method of Claim 13 wherein the eligible 
set of offer descriptions comprises offer descriptions of auction items, items for sale, 
items for swap, job openings, items to buy, and services for sale. 

15. The adaptive offer recommendation method of Claim 13 wherein the 
construction of the recommended set of offer descriptions further comprises calculating a 
relevance score of each offer description in the eligible set of offer descriptions; selecting 
a plurality of offer descriptions of the eligible set of offer descriptions with high relevance 
scores; and applying other selection criteria comprising popularity of the offer description 
in the eligible set of offer descriptions and client preference for the offer description in 
the eligible set of offer descriptions. 

16. An adaptive text recommendation system comprising: 
a query receiving processor for receiving a query; 

a database for storing a plurality of document records including Internet document 
records, private document records, and other public network document records; 
a query response processor for sending a response to the query; and 
an adaptive text processor, coupled to the query receiving processor, the database; 
and the query response processor, for receiving the query from the query receiving 
processor, analyzing the text of an interest set of document records from the database, 
grouping the interest set of document records into clusters; extracting keywords from the 
text of the document records grouped into the clusters, filtering the eligible set of 
document records from the database to meet an application criteria; and adaptively 
constructing the recommended set of document records for the clusters, and passing the 
recommended set of document records to the query response processor. 

17. The adaptive text recommendation system of Claim 16 wherein the eligible 
set of document records comprises document records from the application web site; 
document records from other web sites; document records from private databases; and 
document records selected from the Internet using a search process. 

18. The adaptive text recommendation system of Claim 16 further comprising a 
database update processor for updating the interest set of documents with new 
documents. 

19. The adaptive text recommendation system of Claim 16 wherein the adaptive 
text processor is operable in a distributed manner at a remote location. 

20. A computer storage medium storing the computer readable code for causing a 
computer system to execute the steps of an adaptive text recommendation system, the 
steps comprising: 

clustering of an interest set of documents into clusters; 

extracting keywords for the clusters, the keywords representing the theme or 
concept of the documents of the clusters; 

filtering of an eligible set of documents to meet application criteria; 

adaptively constructing the recommended set of documents for the clusters; and 

presenting the recommended set of documents. 

21. An adaptive data recommendation system comprising: 

a query receiving processor for receiving and processing a query; 

a database for storing a plurality of data description records including Internet data 
description records, private data description records, and other public network data 
description records; 

a query response processor for sending a response to the query; and 

an adaptive data processor, coupled to the query receiving processor, the database; 
and query response processor, for receiving the query from the query receiving processor, 
analyzing the text of an interest set of data description records from the database, 
grouping the interest set of data description records into clusters; extracting keywords 
from the text of the data description records grouped into the clusters, filtering the 
eligible set of data description records from the database to meet an application criteria; 
and adaptively constructing the recommended set of data description records for the 
clusters, and passing the recommended set of data description records to the query 
response processor. 

22. An adaptive text recommendation apparatus comprising: 

a computer comprising input means for entering a query and display means for 
presenting a recommended set of documents; and 

communications means for wirelessly coupling the computer to an information 
network, the information network containing at least an interest set of documents and an 
eligible set of documents; 

wherein the query entered through the input means of the computer enables the 
computer to wirelessly connect to the information network and to execute the steps of an 
adaptive text recommendation system, the steps comprising clustering of the interest set 
of documents into clusters; extracting keywords for the clusters, the keywords 
representing the theme or concept of the documents of the clusters; filtering of the 
eligible set of documents to meet an application criteria; constructing a recommended set 
of documents; and presenting the recommended set of documents using the display means 
of the computer. 

23. A method for adaptively classifying documents, the method comprising: 
clustering of an interest set of documents into clusters; 

extracting keywords for the clusters that represent the theme of the documents of 
the clusters; 

filtering of an eligible set of documents to meet application criteria; 
constructing a recommended set of documents; and 

presenting the recommended set of documents using a presentation technique that 
comprises sending an e-mail, displaying personal e-mail, displaying a greeting, displaying 
an HTML fragment, sending a fax, sending a voicemail, sending a video alert, sending an 
audio alert, and transmitting a file representing the recommended set of documents. 

24. An adaptive document search system comprising: 

a query receiving processor for receiving a search query; 

a database for storing an interest set of document records comprising Internet 
document records, private document records, and other public network document records; 

a search engine, for searching the database for documents matching a search 
criteria; 

a search query response processor for sending a response to the search query; and 
an adaptive text processor, coupled to the query receiving processor, the database, 
the search engine, and the query response processor; 

wherein the adaptive text processor, upon receiving the search query from the 
query receiving processor, analyzes the text of the interest set of document records from 
the database, groups the interest set of document records into clusters; extracts keywords 
from the text of the document records grouped into the clusters, and communicates the 
extracted keywords to the search engine; and 

wherein the search engine searches the database for documents matching the 
search criteria comprising the communicated keywords from the adaptive text processor. 

ABSTRACT 



A network system provides a real-time adaptive 
recommendation set of documents with a high statistical 
measure of relevancy to the requestor device. The 
recommendation set is optimized based on analyzing the 
text of documents of the interest set, categorizing these 
documents into clusters, extracting keywords representing 
the themes or concepts of documents in the clusters, and 
filtering a population of eligible documents accessible to 
the system utilizing site and/or Internet-wide search 
engines. The system is either automatically or manually 
invoked, and it develops and presents the recommendation 
set in real time; for example, upon logging onto a web site 
or as the client views additional documents or pages of a 
website. The recommendation set may be presented as a 
greeting, notification, alert, HTML fragment, fax, 
voicemail, or automatic classification or routing of 
customer e-mail, personal e-mail, job postings, and offers 
for sale or exchange. 